TwinCG: Dual Thread Redundancy with Forward Recovery for Conjugate Gradient Methods

Authors

  • Kiril Dichev
  • Dimitrios S. Nikolopoulos
Abstract

Although iterative solvers such as the Conjugate Gradient (CG) method have been studied for over fifty years, fault tolerance for such solvers has received much attention in recent years. For iterative solvers, two major reliable recovery strategies exist: checkpoint-restart for backward recovery, and some form of redundancy for forward recovery. Important redundancy techniques, such as algorithm-based fault tolerance (ABFT) for sparse matrix-vector products (SpMxV), have recently been proposed and increase the resilience of CG methods. These techniques offer limited recovery options and introduce a tolerable overhead. In this work, we study a more powerful resilience concept: redundant multithreading. It offers more generic and stronger recovery guarantees, covering any soft fault in CG iterations (including, among others, those handled by ABFT SpMxV), but it also requires more resources. We carefully study this conflict between redundancy and efficiency. We propose a fault-tolerant CG method, called TwinCG, which introduces minimal wall-clock time overhead and offers significant advantages in detection and correction strategies. Our method uses Dual Modular Redundancy (DMR) instead of the more expensive Triple Modular Redundancy (TMR); still, it retains the fault-correction advantages of TMR. We describe, implement, and benchmark our iterative solver, and compare it with state-of-the-art techniques in terms of efficiency and fault tolerance capabilities. We find that before parallelization, TwinCG introduces around 56% runtime overhead compared to standard CG, and after parallelization it uses BLAS efficiently. In the presence of faults, it reliably performs forward recovery for a range of problems, outperforming SpMxV ABFT solutions.
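The abstract does not spell out how TwinCG detects and corrects faults, so the following is only a minimal sketch of the general idea of dual redundancy with forward recovery, not the authors' algorithm. It assumes that each CG iteration computes the matrix-vector product twice, compares the two copies to detect a soft fault, and, on a mismatch, keeps the copy that satisfies a column-checksum identity (an ABFT-style test, used here purely as an illustrative tie-breaker). The function names, the `fault` hook, and the sequential execution of the two copies (instead of two threads) are all hypothetical.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product y = A x on plain Python lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def redundant_matvec(A, x, checksum, fault=None, tol=1e-9):
    """Compute A x twice; on disagreement keep the copy passing the checksum test.

    `fault` is a hypothetical hook (copy_index, entry, delta) that corrupts
    one copy to simulate a soft error. Returns (result, fault_detected).
    """
    copies = [matvec(A, x), matvec(A, x)]
    if fault is not None:
        k, i, delta = fault
        copies[k][i] += delta
    # Detection: the two redundant copies must agree.
    if all(abs(a - b) <= tol * (1 + abs(a)) for a, b in zip(*copies)):
        return copies[0], False
    # Correction (assumed tie-breaker): sum(y) must equal (1^T A) x.
    expected = dot(checksum, x)
    errors = [abs(sum(y) - expected) for y in copies]
    return copies[errors.index(min(errors))], True

def cg(A, b, fault_at=None, tol=1e-10, max_iter=200):
    """Textbook CG for a symmetric positive definite A, with a redundant matvec."""
    n = len(b)
    checksum = [sum(A[i][j] for i in range(n)) for j in range(n)]  # 1^T A
    x = [0.0] * n
    r = list(b)          # residual for the zero initial guess
    p = list(r)          # initial search direction
    rs = dot(r, r)
    detected = 0
    for it in range(max_iter):
        # Inject a simulated fault into copy 1 at the chosen iteration.
        fault = (1, 0, 5.0) if it == fault_at else None
        Ap, hit = redundant_matvec(A, p, checksum, fault)
        detected += hit
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        rs_new = dot(r, r)
        if math.sqrt(rs_new) < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x, detected

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, detected = cg(A, b, fault_at=1)
```

In this sketch the solver continues from the current iterate after a detected fault instead of rolling back to a checkpoint, which is what distinguishes forward recovery from checkpoint-restart. The price is the duplicated product in every iteration, the kind of redundancy/efficiency trade-off the abstract refers to.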

Similar resources

Application of frames in Chebyshev and conjugate gradient methods

Given a frame of a separable Hilbert space $H$, we present some iterative methods for solving an operator equation $Lu=f$, where $L$ is a bounded, invertible and symmetric operator on $H$. We present some algorithms based on the knowledge of frame bounds, Chebyshev method and conjugate gradient method, in order to give some approximated solutions to the problem. Then we i...

Handwritten Character Recognition using Modified Gradient Descent Technique of Neural Networks and Representation of Conjugate Descent for Training Patterns

The purpose of this study is to analyze the performance of the backpropagation algorithm with changing training patterns and the second momentum term in feedforward neural networks. This analysis is conducted on 250 different words of three small letters from the English alphabet. These words are presented to two vertical segmentation programs which are designed in MATLAB and based on portions (1...

A New Hybrid Conjugate Gradient Method Based on Eigenvalue Analysis for Unconstrained Optimization Problems

In this paper, two extended three-term conjugate gradient methods based on the Liu-Storey (LS) conjugate gradient method are presented to solve unconstrained optimization problems. A remarkable property of the proposed methods is that the search direction always satisfies the sufficient descent condition independent of line search method, based on eigenvalue analysis. The globa...

Extensions of the Hestenes-Stiefel and Polak-Ribiere-Polyak conjugate gradient methods with sufficient descent property

Using search directions of a recent class of three-term conjugate gradient methods, modified versions of the Hestenes-Stiefel and Polak-Ribiere-Polyak methods are proposed which satisfy the sufficient descent condition. The methods are shown to be globally convergent when the line search fulfills the (strong) Wolfe conditions. Numerical experiments are done on a set of CUTEr unconstrained opti...

Sketching Meets Random Projection in the Dual: A Provable Recovery Algorithm for Big and High-dimensional Data

We provide a unified optimization view of iterative Hessian sketch (IHS) and iterative dual random projection (IDRP). We establish a primal-dual connection between the Hessian sketch and dual random projection, and show that their iterative extensions are optimization processes with preconditioning. We develop accelerated versions of IHS and IDRP based on this insight together with conjugate gr...

Journal:
  • CoRR

Volume abs/1605.04580

Publication date: 2016